Abstract: We show that it can be suboptimal for Bayesian
decision-making agents employing social learning to use correct 
prior probabilities as their initial beliefs. We consider sequential 
Bayesian binary hypothesis testing where each individual agent 
makes a binary decision based on an initial belief, a private signal, 
and the decisions of all earlier-acting agents — with the actions of 
precedent agents causing updates of the initial belief. Each agent 
acts to minimize Bayes risk, with all agents sharing the same 
Bayes costs for Type I (false alarm) and Type II (missed detection) 
errors. The effect of the set of initial beliefs on the decision- 
making performance of the last agent is studied. The last agent 
makes the best decision when the initial beliefs are inaccurate. 
When the private signals are described by Gaussian likelihoods, 
the optimal initial beliefs are not haphazard but rather follow 
a systematic pattern: the earlier-acting agents should act as if 
the prior probability is larger than it is in reality when the true 
prior probability is small, and vice versa. We interpret this as 
being open minded toward the unlikely hypothesis. The early- 
acting agents face a trade-off between making a correct decision 
and being maximally informative to the later-acting agents. 

Index Terms — Bayesian hypothesis testing, distributed detec- 
tion, human decision making, likelihood ratio tests, sequential 
decision making, social learning, social networks, team theory. 

I. Introduction 

Consider decision-making agents facing the task of choos- 
ing between two alternatives. Each agent has a private signal, 
which is not visible to the other agents. The agents sequentially 
make their individual decisions, which are visible to other 
agents. An agent's action contains some information about the 
right (or better) choice, so subsequent agents can learn from 
the action and reflect it in their own actions. For example,
when you are choosing between two phones to buy,
the choices made by your colleagues
can affect your judgment.

Being influenced by earlier-acting agents has been termed 
social learning [1]. It has generally been studied in settings
where each agent has no motivation beyond making a correct 
choice for himself. In this paper, we study the effect of an 
agent's action on subsequent agents and find that making 
correct decisions is generally not equivalent to providing 
information to other agents that maximally benefits them in 
their decision making tasks. Accounting for the effect on other 
agents could be termed social teaching. Also, in any scenario 
with social learning, the earlier-acting agents can be seen as 
advisers to the later-acting agents. As will be detailed later, we 
find that a good adviser should be open minded in the sense 
of being more receptive to the a priori less likely alternative
than she would have been if she were only interested in being 
right rather than also interested in being informative. 

The framework of sequential decision making with social
learning was independently introduced in [2] and [3]. These
works focused primarily on herding, in which all agents
beyond some index take the same action. They showed that
an incorrect herd would arise with positive probability when
private signals are boundedly informative.^1 For example, the
private signals were assumed to be binary and to give true or 
false information, each with positive probability. It can happen 
that a couple of the first agents receive false private signals and 
thus choose wrong actions. Then the effect of these actions on 
the beliefs of subsequent agents can be so great as to cause 
them to ignore their private signals and follow their precedent 
agents. The private signals are bounded so that they cannot be 
strong enough to overcome the effect of the wrong actions. 

Subsequently, [4] showed that learning is incomplete (beliefs
are not eventually focused on the true state) if private
signals are boundedly informative, but agents will asymptotically
settle on the optimal action otherwise. Recently, [5]
extended the result to general network topologies where each
agent can observe decisions made by its neighbors instead of 
all previous agents. 

In another related line of work, [6] studies the effect of
social learning in a quickest detection problem, in which
agents keep updating their beliefs based on previous decisions
and detect the time at which an underlying state changes. It
has a similar framework to [7], which studied the update of private
information in a finite memory.

This paper differentiates itself from the literature in that 
it considers unbounded private signals and does not focus 
on herding behavior. In addition, we focus largely on the 
effect of prior probabilities in decision making. We do not 
assume that an agent knows a correct prior probability for the 
decision at hand. Even if he does, we do not assume he takes 
the shortsighted approach of using the prior probability only 
to optimize the correctness of his own decision. Instead, we 
study the effect of the prior probability on the decisions of 
subsequent agents. Sequential decision making is considered 
from a signal-processing perspective in [8] as well. Its model is
similar to ours except that there all agents know the true prior 
probability; this difference changes the problem substantially. 

Our criterion for optimality is the Bayes risk of the final 
agent; we assume a sequential decision making model in 

1 A signal $Y$ generated under a state $H$ is called boundedly informative if
there exists $\kappa > 0$ such that $\kappa < f_{Y|H}(y \mid h) < 1/\kappa$ for all $y$ and $h$.
which only the decision made by the final agent matters. Since
sequential decision making is a hypothesis testing problem, 
agents adopt likelihood ratio tests to choose their actions [9].
As they observe decisions or actions chosen by precedent 
agents, they compute or update their beliefs to perform more 
precise likelihood ratio tests. The update process depends on 
the initial prior belief and the history of decisions. We derive 
a recursive belief update function. 

Bayes-optimal agents need to know the prior probability in 
order to perform the likelihood ratio test. Hence, intuitively, 
agents with wrong prior beliefs should degrade the decision 
making and yield higher Bayes risk. In addition, they would 
misunderstand the public signals because they do not know 
others' beliefs. 

Contrary to intuition, it turns out that wrong beliefs may 
improve the decision made by the final agent in sequential 
decision making. Especially when the private signals are 
distorted by additive Gaussian noise, the optimal first agent 
is open minded: He acts as if the prior probability is larger 
than it is in reality when the true prior probability is small, 
and vice versa. 

Section II provides additional background and motivation
from human decision makers. Section III describes our sequential
decision making model. In Section IV we investigate
how agents interpret the decision history, update their beliefs,
and make decisions according to their positions in the chain of
agents. It is proven in Section V that the true prior probability
is not the optimal prior belief for N = 2. Examples for
Gaussian likelihoods are presented in Section VI. Section VII
concludes the paper.

II. Background 

The mathematical model presented herein abstracts human 
decision makers so as to be broadly applicable. We are 
motivated in part by a study about the correlation between 
a defendant's physical appearance and juror decisions [10].
It states that jurors perceive a defendant as more intelligent when
the defendant is wearing eyeglasses, which leads to fewer
guilty verdicts. It also says that wearing eyeglasses is especially
effective for African-American defendants. Several other
studies have also revealed the importance of a defendant's
physical appearance on a jury's decision making [11]-[13].
From a logical standpoint, a defendant's appearance should be
irrelevant to judicial decisions because eyeglasses have nothing
to do with crimes, and we believe that jurors do their best to 
make fair and reasonable decisions based only on evidence. 
Then why does this happen in reality? 

We find the answer in Bayesian reasoning and the concept of 
prior probability. Let us liken a jury trial to a hypothesis testing 
problem. A defendant is metaphorically an object in one of two 
states: guilty or not guilty. Perceptions of evidence presented 
by a prosecutor or defense counsel are noisy observations 
about the defendant's true state. Jurors are detectors that make 
a decision based upon the noisy observations. However, one 
element of hypothesis testing is missing: the prior probability 
that the defendant commits a crime. 

Fig. 1. A sequential decision making model with N agents (Alexis, Blake,
..., Norah). The nth agent can observe n - 1 decisions made by the precedent
agents.

Reasonable human beings resemble Bayesian decision makers
[14]-[17]; they need to know the prior to compute the
posterior probability. The prior probability can be critical to
the verdict when the evidence at trial is ambiguous. The 
problem is that the jurors cannot know the defendant's true 
prior probability. Hence, before reaching a verdict, the jurors 
judge the defendant's prior probability by how intelligent, how 
attractive, how friendly, and how threatening the defendant 
"looks." They may be able to estimate the prior probability 
close to the true value but their estimates would not be the 
same as the defendant's true prior probability. 

We are not defending or criticizing this phenomenon but 
just focusing on an interesting issue raised by it: Human 
agents perform Bayesian hypothesis testing with inaccurate 
knowledge of prior probabilities. While the prior probability 
is one of the basic elements of estimation, the effect of 
accuracy of the prior probability has not received a great deal 
of attention. Initially building upon [18], we have previously
studied the effect of categorization of problem instances as
inducing quantization of prior probabilities [19]-[22]. The
present paper is more fundamental in that it addresses whether 
accurate prior probabilities are even the most favorable. We 
have found that inaccurate perception of the prior probability 
may be beneficial in sequential decision making. 

III. Problem Description 

Consider the sequential decision making model depicted in 
Fig. 1. There is an object in a binary state $H \in \{0, 1\}$ with
probability $P(\{H = 0\}) = p_0$ and $P(\{H = 1\}) = 1 - p_0$. There
are also N agents that sequentially detect the state. Agents
do not know the true prior probability of the object. Instead,
the nth agent perceives it as $q_n$. The nth agent observes
decisions made by precedent agents, $\{\hat{H}_1, \ldots, \hat{H}_{n-1}\}$, as well
as a signal about $H$, $Y_n$, generated from a likelihood function
$f_{Y_n|H}$.^2 The private signals $\{Y_n\}$ are conditionally independent
given $H$ and are identically distributed. We assume
that the likelihood ratio $f_{Y_n|H}(y_n \mid 1) / f_{Y_n|H}(y_n \mid 0)$ is an
increasing function of $y_n$.

The nth agent can extract some information from the n - 1 
precedent decisions. The decisions are, however, biased by the 
false impressions that the precedent agents have of the object. 
Even worse, the nth agent does not know what $q_1, \ldots, q_{n-1}$
are. Thus, the nth agent assumes that they are all equal to
$q_n$ and interprets the precedent decisions accordingly. Using
the history of decisions, the nth agent updates its prior belief 

2 The signal in this model has a continuous value while the corrupted signal
has a discrete value in the initially-proposed framework [2], [3].
before applying its likelihood ratio test. We will define a recursive
function that describes the belief update in Section IV.



Our interest is in the last agent Norah and her decision $\hat{H}_N$.
Upon observing her private signal $Y_N$ and the $N-1$ precedent
decisions $\hat{H}_1, \ldots, \hat{H}_{N-1}$, she determines her decision rule. We
evaluate the decision rule by a common criterion, Bayes risk,
which measures the expected cost of her decision. The relative
importance of correct decisions and errors can be abstracted
as a cost function $c(\hat{H}, H)$, which defines penalties for false
alarm or Type I error (choosing $\hat{H} = 1$ when $H = 0$), correct
rejection (choosing $\hat{H} = 0$ when $H = 0$), hit (choosing $\hat{H} = 1$
when $H = 1$), and missed detection or Type II error (choosing
$\hat{H} = 0$ when $H = 1$). For simplicity, we assume the correct
decisions have zero cost and use the shorthand notations $c_{10} = c(1, 0)$
and $c_{01} = c(0, 1)$ respectively for costs of false alarms
and missed detections. In addition, we assume that all agents have
the same costs; they are a team in the sense of Radner [23].
Then the Bayes risk is given by

$$R_N = c_{10}\, p_0\, p_{\hat{H}_N|H}(1 \mid 0) + c_{01}\, (1 - p_0)\, p_{\hat{H}_N|H}(0 \mid 1). \tag{1}$$

The computation of (1) depends on the previous decisions
$\hat{H}_1, \ldots, \hat{H}_{N-1}$. Therefore, the correct computation of the
expected cost is

$$R_N = \sum_{h_1, \ldots, h_{N-1}} \Big( c_{10}\, p_0\, p_{\hat{H}_N, \hat{H}_{N-1}, \ldots, \hat{H}_1|H}(1, h_{N-1}, \ldots, h_1 \mid 0) + c_{01}\, (1 - p_0)\, p_{\hat{H}_N, \hat{H}_{N-1}, \ldots, \hat{H}_1|H}(0, h_{N-1}, \ldots, h_1 \mid 1) \Big). \tag{2}$$

We will discuss the optimal values of $q_n$ that minimize (2).

It is important to note that, in our model, each agent uses 
a decision rule optimized for her own belief; the agents do 
not adjust their decision rules for the sake of Norah. In other 
words, for all n = 1, . . . , N, the nth agent adopts the decision 
rule that minimizes her Bayes risk R n , and her decision is 
shown to the other agents as a public signal. In contrast, 
the agents could adjust their decision rules in an attempt to 
minimize the Bayes risk of a single collective decision, as 
studied for the combination of social learning and aggregation 
by voting in [24].

We now introduce additional notation for the rest of the paper.
Random variables are in uppercase while their realizations
are in lowercase. We denote a probability density function
(pdf) of a continuous random variable as $f$ and a probability
mass function (pmf) of a discrete random variable as $p$. A
subscript number $n$ means "of the nth agent." A superscript
letter A (B) means "upon observing Alexis's (Blake's)
decision"; we sometimes use 0 or 1 instead of the Roman
letter to specify a decision value. For example, $q_3^{AB}$ denotes
the updated belief of the third agent, Chuck, upon observing
Alexis's and Blake's decisions $\hat{H}_1$ and $\hat{H}_2$, and $q_3^{10}$ denotes
Chuck's updated belief upon observing $\hat{H}_1 = 1$ and $\hat{H}_2 = 0$.
A subscript letter A (B) means "that Alexis (Blake) thinks."

3 We use the term belief to distinguish from the true prior probability and to 
capture that it is what agents believe as the prior probability. Agents initially 
perceive the prior probability in some way, which we call the prior beliefs. 
After they observe precedent decisions, they modify the prior beliefs. The 
beliefs are then called updated. 



For example, Blake thinks that the probability of Alexis
choosing 0 when the true state is 0 is $p_{\hat{H}_1|H}(0 \mid 0)_B$. We
need to clarify who thinks it because the agents are not aware
of others' prior beliefs. This will be explained in detail in
Section IV-B.

IV. Prior Belief Update and Decision Making 

Our model assumes unbounded private signals. Thus, unlike
in [2], [3], it is always possible that a subsequent agent may
not follow previous decisions; that is, incorrect herding does
not occur. We now discuss the utilization of a decision history
as well as private signals for Bayesian hypothesis testing. A 
position-wise decision-making strategy will be provided. This 
can be interpreted as each agent updating his prior belief based 
on the decision history and then applying a likelihood ratio test 
with his private signal. 

A. Alexis, the First Agent 

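Alexis performs normal binary hypothesis testing because
she has no precedent decision. She uses the following likelihood
ratio test with her prior belief $q_1$:

$$\frac{f_{Y_1|H}(y_1 \mid 1)}{f_{Y_1|H}(y_1 \mid 0)} \underset{\hat{H}_1(y_1)=0}{\overset{\hat{H}_1(y_1)=1}{\gtrless}} \frac{c_{10}\, q_1}{c_{01}\, (1 - q_1)}. \tag{3}$$

Since the likelihood ratio is increasing in $y_1$, the likelihood
ratio test can be simplified to comparison with an appropriate
decision threshold:

$$y_1 \underset{\hat{H}_1(y_1)=0}{\overset{\hat{H}_1(y_1)=1}{\gtrless}} \lambda(q_1), \tag{4}$$

where $\lambda(q)$ denotes the decision threshold that satisfies

$$\frac{f_{Y|H}(\lambda \mid 1)}{f_{Y|H}(\lambda \mid 0)} = \frac{c_{10}\, q}{c_{01}\, (1 - q)}. \tag{5}$$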
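As a minimal sketch of how the threshold $\lambda(q)$ in (5) could be found numerically, the following bisection relies only on the assumption that the likelihood ratio is increasing. The function names and the search bracket are illustrative and are not from the paper; the Gaussian log-likelihood ratio used for the check assumes the unit-mean-shift model introduced later in Section VI, for which the threshold has the closed form $1/2 + \sigma^2 \log\big(c_{10} q / (c_{01}(1-q))\big)$.

```python
import math

def gaussian_log_lr(y, sigma=1.0):
    """log f(y|1) - log f(y|0) for Y = H + W, W ~ N(0, sigma^2) (assumed model)."""
    return (-(y - 1.0) ** 2 + y ** 2) / (2.0 * sigma ** 2)

def solve_threshold(log_lr, q, c10=1.0, c01=1.0, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for lambda(q) in (5); log_lr must be increasing in y."""
    target = math.log(c10 * q / (c01 * (1.0 - q)))
    if not (log_lr(lo) < target < log_lr(hi)):
        raise ValueError("threshold is outside the search bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_lr(mid) < target:
            lo = mid   # likelihood ratio still too small: move right
        else:
            hi = mid   # likelihood ratio large enough: move left
    return 0.5 * (lo + hi)

# Compare the numerical threshold with the Gaussian closed form for q = 0.3.
lam = solve_threshold(gaussian_log_lr, q=0.3)
closed_form = 0.5 + math.log(0.3 / 0.7)
```

Working with the log-likelihood ratio rather than the ratio itself avoids underflow of the densities far from the means.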



B. Blake, the Second Agent 

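Blake observes Alexis's decision. Considering $\hat{H}_1$ as another
corrupted signal of $H$ like $Y_2$, he modifies the likelihood
ratio test (3) with his prior belief $q_2$ to

$$\frac{f_{Y_2,\hat{H}_1|H}(y_2, h_1 \mid 1)}{f_{Y_2,\hat{H}_1|H}(y_2, h_1 \mid 0)} \underset{\hat{H}_2(y_2)=0}{\overset{\hat{H}_2(y_2)=1}{\gtrless}} \frac{c_{10}\, q_2}{c_{01}\, (1 - q_2)}. \tag{6}$$

In the left-hand side of (6),

$$f_{Y_2,\hat{H}_1|H}(y_2, h_1 \mid h) = f_{Y_2|H}(y_2 \mid h)\, p_{\hat{H}_1|H}(h_1 \mid h)$$

because the private signals $Y_1$ and $Y_2$ are conditionally independent
given $H$. We can rewrite (6) as^4

$$\frac{f_{Y_2|H}(y_2 \mid 1)}{f_{Y_2|H}(y_2 \mid 0)} \underset{\hat{H}_2(y_2)=0}{\overset{\hat{H}_2(y_2)=1}{\gtrless}} \frac{c_{10}\, q_2\, p_{\hat{H}_1|H}(h_1 \mid 0)_B}{c_{01}\, (1 - q_2)\, p_{\hat{H}_1|H}(h_1 \mid 1)_B}. \tag{7}$$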
4 The subscript "B" in the term "$p_{\hat{H}_1|H}(h_1 \mid h)_B$" indicates the value of
$p_{\hat{H}_1|H}(h_1 \mid h)$ that Blake (the second agent) thinks. We specify this because
Blake does not know Alexis's belief $q_1$. Thus, he interprets her decision based
on his belief $q_2$. The value is different from the true value of $p_{\hat{H}_1|H}(h_1 \mid h) =
p_{\hat{H}_1|H}(h_1 \mid h)_A$. Of course, it will also be different from what Chuck,
the third agent, thinks, which is denoted by $p_{\hat{H}_1|H}(h_1 \mid h)_C$. This will be
explained in the next paragraph.
The likelihood ratio test (7) can be interpreted with Blake
updating his prior belief upon observing Alexis's decision $\hat{H}_1$.
Combined with $q_2$, $p_{\hat{H}_1|H}(h_1 \mid h)_B$ updates his prior belief
from $q_2$ to $q_2^A$:

$$\frac{q_2^A}{1 - q_2^A} = \frac{q_2}{1 - q_2} \cdot \frac{p_{\hat{H}_1|H}(h_1 \mid 0)_B}{p_{\hat{H}_1|H}(h_1 \mid 1)_B}. \tag{8}$$

The updated belief is

$$q_2^A = \frac{q_2\, p_{\hat{H}_1|H}(h_1 \mid 0)_B}{q_2\, p_{\hat{H}_1|H}(h_1 \mid 0)_B + (1 - q_2)\, p_{\hat{H}_1|H}(h_1 \mid 1)_B}. \tag{9}$$



We have to make clear that Blake does not correctly
compute $p_{\hat{H}_1|H}(h_1 \mid h)_B$ because he does not know $q_1$. The
true probability is given by

$$p_{\hat{H}_1|H}(0 \mid h) = p_{\hat{H}_1|H}(0 \mid h)_A = P(\{Y_1 \le \lambda(q_1) \mid H = h\}) = \int_{-\infty}^{\lambda(q_1)} f_{Y|H}(y \mid h)\, dy, \tag{10}$$

but Blake evaluates Alexis's decision $\hat{H}_1$ as if it were made
based on $q_2$, not $q_1$:

$$p_{\hat{H}_1|H}(0 \mid h)_B = P(\{Y_1 \le \lambda(q_2) \mid H = h\}) = \int_{-\infty}^{\lambda(q_2)} f_{Y|H}(y \mid h)\, dy. \tag{11}$$



An interesting observation is that Alexis's biased belief $q_1$
does not affect Blake's belief update. There is no trace of
$q_1$ in (9) and (11). Suppose that Alexis knows the true prior
probability $p_0$ and uses the decision threshold $\lambda(p_0)$. Still
Blake, who does not know what belief Alexis has, thinks that
the conditional probability of Alexis declaring $\hat{H}_1 = 0$ is given
by (11) and updates his belief as in (9). It is clear in (9) that
the updated belief depends only on Blake's initial belief and
Alexis's decision.
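The update in (9) and (11) can be sketched in a few lines, assuming the Gaussian model of Section VI ($Y = H + W$ with zero-mean Gaussian noise), for which the threshold has a closed form. The function names are illustrative only. Note that Alexis's belief $q_1$ does not appear among the arguments, which mirrors the observation above.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def threshold(q, c10=1.0, c01=1.0, sigma=1.0):
    """Decision threshold lambda(q) under the assumed Gaussian model."""
    return 0.5 + sigma ** 2 * math.log(c10 * q / (c01 * (1.0 - q)))

def blake_update(q2, h1, c10=1.0, c01=1.0, sigma=1.0):
    """Blake's updated belief after observing Alexis's decision h1, as in (9)."""
    lam = threshold(q2, c10, c01, sigma)       # Blake assumes Alexis used lambda(q2)
    p0_given_h0 = phi(lam / sigma)             # P(Y1 <= lam | H = 0), as in (11)
    p0_given_h1 = phi((lam - 1.0) / sigma)     # P(Y1 <= lam | H = 1), as in (11)
    if h1 == 0:
        like0, like1 = p0_given_h0, p0_given_h1
    else:
        like0, like1 = 1.0 - p0_given_h0, 1.0 - p0_given_h1
    return q2 * like0 / (q2 * like0 + (1.0 - q2) * like1)

q2_after_0 = blake_update(0.3, 0)   # belief in H = 0 rises after seeing a 0
q2_after_1 = blake_update(0.3, 1)   # belief in H = 0 falls after seeing a 1
```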

However, Alexis's prior belief still affects Blake's performance
in some way, which is related to the probability of error.
Alexis's biased belief changes the probability of her decision.
The changed probability is embedded in the probability of
Blake's decision:

$$\begin{aligned}
p_{\hat{H}_2|H}(h_2 \mid 0) &= \sum_{h_1} p_{\hat{H}_2,\hat{H}_1|H}(h_2, h_1 \mid 0) \\
&= p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid 0, 0)_B \times p_{\hat{H}_1|H}(0 \mid 0)_A + p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid 1, 0)_B \times p_{\hat{H}_1|H}(1 \mid 0)_A, \\
p_{\hat{H}_2|H}(h_2 \mid 1) &= \sum_{h_1} p_{\hat{H}_2,\hat{H}_1|H}(h_2, h_1 \mid 1) \\
&= p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid 0, 1)_B \times p_{\hat{H}_1|H}(0 \mid 1)_A + p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid 1, 1)_B \times p_{\hat{H}_1|H}(1 \mid 1)_A.
\end{aligned}$$

Thus, Alexis's biased belief changes the probability of Blake's
decision as well as that of her decision.



C. Chuck, the Third Agent 

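Chuck's detection process is the same as Blake's. He
observes both Alexis's and Blake's decisions and also updates
his prior belief $q_3$ like in (8):

$$\begin{aligned}
\frac{q_3^{AB}}{1 - q_3^{AB}} &= \frac{q_3}{1 - q_3} \cdot \frac{p_{\hat{H}_2,\hat{H}_1|H}(h_2, h_1 \mid 0)_C}{p_{\hat{H}_2,\hat{H}_1|H}(h_2, h_1 \mid 1)_C} \\
&= \left( \frac{q_3}{1 - q_3} \cdot \frac{p_{\hat{H}_1|H}(h_1 \mid 0)_C}{p_{\hat{H}_1|H}(h_1 \mid 1)_C} \right) \frac{p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid h_1, 0)_C}{p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid h_1, 1)_C}.
\end{aligned}$$

Be careful that $\hat{H}_1$ and $\hat{H}_2$ are not conditionally independent
given $H$ because Blake's decision $\hat{H}_2$ depends on Alexis's
decision $\hat{H}_1$:

$$p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid h_1, 0) \ne p_{\hat{H}_2|H}(h_2 \mid 0).$$

Chuck's update process can be split into two steps. The
first step is to infer Blake's updated belief based on Alexis's
decision:

$$\frac{q_3^A}{1 - q_3^A} = \frac{q_3}{1 - q_3} \cdot \frac{p_{\hat{H}_1|H}(h_1 \mid 0)_C}{p_{\hat{H}_1|H}(h_1 \mid 1)_C}. \tag{12}$$

The second step is to update his own belief from $q_3^A$ based on
Blake's decision:

$$\frac{q_3^{AB}}{1 - q_3^{AB}} = \frac{q_3^A}{1 - q_3^A} \cdot \frac{p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid h_1, 0)_C}{p_{\hat{H}_2|\hat{H}_1,H}(h_2 \mid h_1, 1)_C}. \tag{13}$$

Please note that Chuck does not know Alexis's and Blake's
prior beliefs, $q_1$ and $q_2$, just as Blake did not know Alexis's. Thus
Chuck infers everything based on his own belief $q_3$, which is
indicated by the subscript "C" in (12) and (13).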

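Details of computations of (12) and (13) are as follows:

$$p_{\hat{H}_1|H}(0 \mid h)_C = \int_{-\infty}^{\lambda(q_3)} f_{Y_1|H}(y \mid h)\, dy, \tag{14a}$$
$$p_{\hat{H}_1|H}(1 \mid h)_C = \int_{\lambda(q_3)}^{\infty} f_{Y_1|H}(y \mid h)\, dy. \tag{14b}$$

Substituting (14) in (12), Chuck can compute $q_3^A$ for $\hat{H}_1 = 0$
and $\hat{H}_1 = 1$ respectively:

$$q_3^0 = \frac{q_3}{q_3 + (1 - q_3) \dfrac{\int_{-\infty}^{\lambda(q_3)} f_{Y_1|H}(y \mid 1)\, dy}{\int_{-\infty}^{\lambda(q_3)} f_{Y_1|H}(y \mid 0)\, dy}}, \tag{15a}$$
$$q_3^1 = \frac{q_3}{q_3 + (1 - q_3) \dfrac{\int_{\lambda(q_3)}^{\infty} f_{Y_1|H}(y \mid 1)\, dy}{\int_{\lambda(q_3)}^{\infty} f_{Y_1|H}(y \mid 0)\, dy}}. \tag{15b}$$

Then,

$$p_{\hat{H}_2|\hat{H}_1,H}(0 \mid h_1, h)_C = P(\{Y_2 \le \lambda(q_3^A) \mid H = h\}) = \int_{-\infty}^{\lambda(q_3^A)} f_{Y_2|H}(y \mid h)\, dy, \tag{16a}$$
$$p_{\hat{H}_2|\hat{H}_1,H}(1 \mid h_1, h)_C = P(\{Y_2 > \lambda(q_3^A) \mid H = h\}) = \int_{\lambda(q_3^A)}^{\infty} f_{Y_2|H}(y \mid h)\, dy. \tag{16b}$$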



Even though the value of $h_1$ may not seem to be used in (16),
it is inherent in $q_3^A$ and affects the computation results. Chuck's
updated belief $q_3^{AB}$ is obtained by substituting (15) and (16) in
(13).

Fig. 2. The function $U_4(q_4, h_1, h_2, h_3)$, the updated belief of the fourth agent
($q_4^{ABC}$), for each possible combination of Alexis's, Blake's, and Chuck's
decisions $[h_1\ h_2\ h_3]$ when $c_{10} = c_{01} = 1$ and private signals are distorted
by additive Gaussian noise with zero mean and unit variance. The updated
belief is mostly dependent on Chuck's decision; the top four curves are for
$h_3 = 0$ and the bottom four curves are for $h_3 = 1$. [Horizontal axis: initial prior belief of the fourth agent ($q_4$).]



D. Norah, the Nth Agent 

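Norah, the $N$th agent, observes $Y_N$ and $\hat{H}_1, \ldots, \hat{H}_{N-1}$.
Paralleling the arguments in the preceding sections, her prior
belief update is a function of $q_N$ as well as $\hat{H}_1, \ldots, \hat{H}_{N-1}$ but
not of $q_1, \ldots, q_{N-1}$. Thus, we can define a general prior belief
update function:

$$q_N^{AB \cdots M} = U_N(q_N, h_1, h_2, \ldots, h_{N-1}).$$

The function $U_n$ has a recurrence relation:

• For $n = 1$, $U_1(q) = q$.

• For $n > 1$,

$$U_n(q, h_1, \ldots, h_{n-2}, 0) = \frac{\tilde{q}}{\tilde{q} + (1 - \tilde{q}) \dfrac{\int_{-\infty}^{\lambda(\tilde{q})} f_{Y_{n-1}|H}(y \mid 1)\, dy}{\int_{-\infty}^{\lambda(\tilde{q})} f_{Y_{n-1}|H}(y \mid 0)\, dy}},$$
$$U_n(q, h_1, \ldots, h_{n-2}, 1) = \frac{\tilde{q}}{\tilde{q} + (1 - \tilde{q}) \dfrac{\int_{\lambda(\tilde{q})}^{\infty} f_{Y_{n-1}|H}(y \mid 1)\, dy}{\int_{\lambda(\tilde{q})}^{\infty} f_{Y_{n-1}|H}(y \mid 0)\, dy}},$$

where $\tilde{q} = U_{n-1}(q, h_1, \ldots, h_{n-2})$.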
Fig. 2 depicts the function $U_4(q_4, h_1, h_2, h_3)$ for N = 4 for
eight possible combinations of Alexis's, Blake's, and Chuck's
decisions $[h_1\ h_2\ h_3]$. An interesting property of $U_n$ is that
the updated belief is much more dependent on the most recent
decision $h_{n-1}$ than on the earlier decisions $h_1, \ldots, h_{n-2}$. This
is especially the case when the $(n-1)$st agent has not followed
precedent. This is because the nth agent rationally concludes
that the $(n-1)$st agent observed strong evidence to justify a
deviation from precedent. For example, if the decision history
of the first five agents is [0 0 0 0 1] then the sixth agent
takes the last decision 1 seriously even though four agents
chose 0. A reversal of an arbitrarily long precedent sequence
may occur because we assume unbounded private signals; if
private signals are bounded like in [2], [3], then the influence
of precedent can reach a point where agents cannot receive a
signal strong enough to justify a decision running counter to
precedent.
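The recurrence for $U_n$ can be sketched as a loop over the decision history. The snippet below is illustrative only: it assumes the Gaussian model of Section VI (unit mean shift, noise standard deviation `sigma`), and the function names are not from the paper. It evaluates the [0 0 0 0 1] example discussed above.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def threshold(q, c10=1.0, c01=1.0, sigma=1.0):
    """Decision threshold lambda(q) under the assumed Gaussian model."""
    return 0.5 + sigma ** 2 * math.log(c10 * q / (c01 * (1.0 - q)))

def update_belief(q, history, c10=1.0, c01=1.0, sigma=1.0):
    """U_n(q, h_1, ..., h_{n-1}), computed by unrolling the recurrence."""
    belief = q                                   # U_1(q) = q
    for h in history:
        lam = threshold(belief, c10, c01, sigma) # threshold the observer attributes
        p_zero_h0 = phi(lam / sigma)             # P(decide 0 | H = 0)
        p_zero_h1 = phi((lam - 1.0) / sigma)     # P(decide 0 | H = 1)
        if h == 0:
            like0, like1 = p_zero_h0, p_zero_h1
        else:
            like0, like1 = 1.0 - p_zero_h0, 1.0 - p_zero_h1
        belief = belief * like0 / (belief * like0 + (1.0 - belief) * like1)
    return belief

after_four_zeros = update_belief(0.5, [0, 0, 0, 0])
after_reversal = update_belief(0.5, [0, 0, 0, 0, 1])
```

Under these assumptions, the single deviating decision pulls the sixth agent's belief in $H = 0$ below one half even though four earlier agents chose 0.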



V. Optimal Initial Belief 
We have constructed the prior belief update and decision
making model in Section IV. In this section, we want to
investigate when the system can achieve the minimum Bayes 
risk. For simplicity, we only consider N = 2. Note that the 
Bayes risk of the system is the same as Blake's Bayes risk 
because his decision is adopted as the final decision. 

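Let us recapitulate the computation of Blake's Bayes risk.
Alexis chooses her decision threshold as $\lambda_1 = \lambda(q_1)$. Her
probabilities of errors are given by

$$P_{e,1}^{\mathrm{I}} = p_{\hat{H}_1|H}(1 \mid 0) = \int_{\lambda_1}^{\infty} f_{Y_1|H}(y \mid 0)\, dy,$$
$$P_{e,1}^{\mathrm{II}} = p_{\hat{H}_1|H}(0 \mid 1) = \int_{-\infty}^{\lambda_1} f_{Y_1|H}(y \mid 1)\, dy.$$

Blake thinks that Alexis uses the decision threshold $\lambda_B = \lambda(q_2)$
and computes her probabilities of errors differently:

$$P_{e,1_B}^{\mathrm{I}} = p_{\hat{H}_1|H}(1 \mid 0)_B = \int_{\lambda_B}^{\infty} f_{Y_1|H}(y \mid 0)\, dy, \tag{17a}$$
$$P_{e,1_B}^{\mathrm{II}} = p_{\hat{H}_1|H}(0 \mid 1)_B = \int_{-\infty}^{\lambda_B} f_{Y_1|H}(y \mid 1)\, dy. \tag{17b}$$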

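When Alexis decides $\hat{H}_1 = 0$, Blake updates his belief $q_2$
to $q_2^0$:

$$\frac{q_2^0}{1 - q_2^0} = \frac{q_2}{1 - q_2} \cdot \frac{1 - P_{e,1_B}^{\mathrm{I}}}{P_{e,1_B}^{\mathrm{II}}}
\quad \Longrightarrow \quad
q_2^0 = \frac{q_2 (1 - P_{e,1_B}^{\mathrm{I}})}{q_2 (1 - P_{e,1_B}^{\mathrm{I}}) + (1 - q_2) P_{e,1_B}^{\mathrm{II}}}, \tag{18}$$

and his decision threshold is $\lambda_2^0 = \lambda(q_2^0)$. His probabilities of
errors are given by

$$P_{e,2}^{\mathrm{I}_0} = p_{\hat{H}_2|\hat{H}_1,H}(1 \mid 0, 0) = \int_{\lambda_2^0}^{\infty} f_{Y_2|H}(y \mid 0)\, dy, \tag{19a}$$
$$P_{e,2}^{\mathrm{II}_0} = p_{\hat{H}_2|\hat{H}_1,H}(0 \mid 0, 1) = \int_{-\infty}^{\lambda_2^0} f_{Y_2|H}(y \mid 1)\, dy. \tag{19b}$$

Likewise, when Alexis decides $\hat{H}_1 = 1$, Blake updates his
belief $q_2$ to $q_2^1$:

$$\frac{q_2^1}{1 - q_2^1} = \frac{q_2}{1 - q_2} \cdot \frac{P_{e,1_B}^{\mathrm{I}}}{1 - P_{e,1_B}^{\mathrm{II}}}
\quad \Longrightarrow \quad
q_2^1 = \frac{q_2 P_{e,1_B}^{\mathrm{I}}}{q_2 P_{e,1_B}^{\mathrm{I}} + (1 - q_2)(1 - P_{e,1_B}^{\mathrm{II}})}, \tag{20}$$

and his decision threshold is $\lambda_2^1 = \lambda(q_2^1)$. His probabilities of
errors are given by

$$P_{e,2}^{\mathrm{I}_1} = p_{\hat{H}_2|\hat{H}_1,H}(1 \mid 1, 0) = \int_{\lambda_2^1}^{\infty} f_{Y_2|H}(y \mid 0)\, dy, \tag{21a}$$
$$P_{e,2}^{\mathrm{II}_1} = p_{\hat{H}_2|\hat{H}_1,H}(0 \mid 1, 1) = \int_{-\infty}^{\lambda_2^1} f_{Y_2|H}(y \mid 1)\, dy. \tag{21b}$$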
J — 00 
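Now we can compute Blake's Bayes risk $R_2$:

$$\begin{aligned}
R_2 &= c_{10}\, p_{\hat{H}_2,H}(1, 0) + c_{01}\, p_{\hat{H}_2,H}(0, 1) \\
&= c_{10}\, p_{\hat{H}_2|\hat{H}_1,H}(1 \mid 0, 0)\, p_{\hat{H}_1|H}(0 \mid 0)\, p_H(0) + c_{10}\, p_{\hat{H}_2|\hat{H}_1,H}(1 \mid 1, 0)\, p_{\hat{H}_1|H}(1 \mid 0)\, p_H(0) \\
&\quad + c_{01}\, p_{\hat{H}_2|\hat{H}_1,H}(0 \mid 0, 1)\, p_{\hat{H}_1|H}(0 \mid 1)\, p_H(1) + c_{01}\, p_{\hat{H}_2|\hat{H}_1,H}(0 \mid 1, 1)\, p_{\hat{H}_1|H}(1 \mid 1)\, p_H(1) \\
&= c_{10} \left[ P_{e,2}^{\mathrm{I}_0} (1 - P_{e,1}^{\mathrm{I}}) + P_{e,2}^{\mathrm{I}_1} P_{e,1}^{\mathrm{I}} \right] p_0
 + c_{01} \left[ P_{e,2}^{\mathrm{II}_0} P_{e,1}^{\mathrm{II}} + P_{e,2}^{\mathrm{II}_1} (1 - P_{e,1}^{\mathrm{II}}) \right] (1 - p_0).
\end{aligned} \tag{22}$$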
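To make the chain of computations leading to (22) concrete, here is a minimal sketch that evaluates Blake's Bayes risk for given beliefs. It assumes the Gaussian model of Section VI, so every integral reduces to a normal CDF; the function and variable names are illustrative, not the authors' code.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def threshold(q, c10=1.0, c01=1.0, sigma=1.0):
    """lambda(q) for Y = H + W with W ~ N(0, sigma^2) (assumed model)."""
    return 0.5 + sigma ** 2 * math.log(c10 * q / (c01 * (1.0 - q)))

def error_probs(lam, sigma=1.0):
    """(Type I, Type II) error probabilities of the rule 'decide 1 iff y > lam'."""
    return 1.0 - phi(lam / sigma), phi((lam - 1.0) / sigma)

def bayes_risk_two_agents(q1, q2, p0, c10=1.0, c01=1.0, sigma=1.0):
    # Alexis's actual error probabilities, set by her own belief q1.
    pe1_I, pe1_II = error_probs(threshold(q1, c10, c01, sigma), sigma)
    # What Blake believes her error probabilities are (he assumes q2), as in (17).
    b_I, b_II = error_probs(threshold(q2, c10, c01, sigma), sigma)
    # Blake's updated beliefs after observing 0 or 1, as in (18) and (20).
    q2_0 = q2 * (1 - b_I) / (q2 * (1 - b_I) + (1 - q2) * b_II)
    q2_1 = q2 * b_I / (q2 * b_I + (1 - q2) * (1 - b_II))
    # Blake's error probabilities for each observed decision, as in (19) and (21).
    pe2_I0, pe2_II0 = error_probs(threshold(q2_0, c10, c01, sigma), sigma)
    pe2_I1, pe2_II1 = error_probs(threshold(q2_1, c10, c01, sigma), sigma)
    # Final line of (22).
    return (c10 * p0 * (pe2_I0 * (1 - pe1_I) + pe2_I1 * pe1_I)
            + c01 * (1 - p0) * (pe2_II0 * pe1_II + pe2_II1 * (1 - pe1_II)))

risk_true_beliefs = bayes_risk_two_agents(0.3, 0.3, p0=0.3)
risk_open_minded = bayes_risk_two_agents(0.35, 0.3, p0=0.3)
```

With these assumed parameters, nudging Alexis's belief from 0.3 to 0.35 while Blake keeps the true prior already lowers the risk slightly.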






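The Bayes risk $R_2$ in (22) is a function of $q_1$ and $q_2$. It
seems natural that $R_2$ is minimum at $q_1 = q_2 = p_0$ because
Alexis will make the best decision she can and Blake will
not misunderstand her decision. Surprisingly, however, this
turns out not to be true. We will prove it by studying Alexis's
optimal belief $q_1^*$ with respect to minimizing $R_2$.

Let us consider the first derivative of (22) with respect to
$q_1$. Blake's error probabilities depend only on $q_2$, so

$$\frac{\partial R_2}{\partial q_1} = c_{10}\, p_0 \left( P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0} \right) \frac{dP_{e,1}^{\mathrm{I}}}{dq_1} + c_{01}\, (1 - p_0) \left( P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1} \right) \frac{dP_{e,1}^{\mathrm{II}}}{dq_1}.$$

We want to find $q_1$ that makes this first derivative zero. Using

$$\frac{dP_{e,1}^{\mathrm{I}}}{dq_1} = \frac{dP_{e,1}^{\mathrm{I}}}{d\lambda_1} \frac{d\lambda_1}{dq_1} = -f_{Y_1|H}(\lambda_1 \mid 0) \frac{d\lambda_1}{dq_1},$$
$$\frac{dP_{e,1}^{\mathrm{II}}}{dq_1} = \frac{dP_{e,1}^{\mathrm{II}}}{d\lambda_1} \frac{d\lambda_1}{dq_1} = f_{Y_1|H}(\lambda_1 \mid 1) \frac{d\lambda_1}{dq_1},$$

this occurs when

$$c_{10}\, p_0 \left( P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0} \right) f_{Y_1|H}(\lambda_1 \mid 0) = c_{01}\, (1 - p_0) \left( P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1} \right) f_{Y_1|H}(\lambda_1 \mid 1),$$

that is,

$$\frac{f_{Y_1|H}(\lambda_1 \mid 1)}{f_{Y_1|H}(\lambda_1 \mid 0)} = \frac{c_{10}\, p_0 \left( P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0} \right)}{c_{01}\, (1 - p_0) \left( P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1} \right)}. \tag{23}$$

Note that $\lambda_1 = \lambda(q_1)$ is a solution to (5),

$$\frac{f_{Y_1|H}(\lambda_1 \mid 1)}{f_{Y_1|H}(\lambda_1 \mid 0)} = \frac{c_{10}\, q_1}{c_{01}\, (1 - q_1)}.$$

Therefore Alexis's optimal belief $q_1^*$ needs to satisfy

$$\frac{q_1^*}{1 - q_1^*} = \frac{p_0 \left( P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0} \right)}{(1 - p_0) \left( P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1} \right)}, \tag{24}$$

where the value of $(P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0}) / (P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1})$ does not have
to be 1. In fact, in additive Gaussian noise cases, it is not equal
to one except for $p_0 = c_{01}/(c_{10} + c_{01})$. Therefore, the optimal
value of $q_1$ is not $p_0$ in general.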
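Because the right-hand side of (24) depends on $q_2$ but not on $q_1$, the stationary point for a fixed $q_2$ can be read off directly. The following sketch does so under the assumed Gaussian model of Section VI; the names are illustrative, and it only evaluates the first-order condition rather than verifying that the stationary point is a minimum.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def threshold(q, c10=1.0, c01=1.0, sigma=1.0):
    """lambda(q) for Y = H + W with W ~ N(0, sigma^2) (assumed model)."""
    return 0.5 + sigma ** 2 * math.log(c10 * q / (c01 * (1.0 - q)))

def error_probs(lam, sigma=1.0):
    """(Type I, Type II) error probabilities of the rule 'decide 1 iff y > lam'."""
    return 1.0 - phi(lam / sigma), phi((lam - 1.0) / sigma)

def optimal_alexis_belief(q2, p0, c10=1.0, c01=1.0, sigma=1.0):
    """Solve the first-order condition (24) for q1*, holding Blake's belief q2 fixed."""
    b_I, b_II = error_probs(threshold(q2, c10, c01, sigma), sigma)
    q2_0 = q2 * (1 - b_I) / (q2 * (1 - b_I) + (1 - q2) * b_II)
    q2_1 = q2 * b_I / (q2 * b_I + (1 - q2) * (1 - b_II))
    pe2_I0, pe2_II0 = error_probs(threshold(q2_0, c10, c01, sigma), sigma)
    pe2_I1, pe2_II1 = error_probs(threshold(q2_1, c10, c01, sigma), sigma)
    odds = p0 * (pe2_I1 - pe2_I0) / ((1 - p0) * (pe2_II0 - pe2_II1))
    return odds / (1.0 + odds)

q1_star_small_prior = optimal_alexis_belief(q2=0.3, p0=0.3)   # exceeds the true prior 0.3
q1_star_balanced = optimal_alexis_belief(q2=0.5, p0=0.5)      # coincides with the true prior
```

The second call illustrates the exceptional case $p_0 = c_{01}/(c_{10} + c_{01})$, where the ratio in (24) equals one by symmetry.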

VI. Example: Gaussian Likelihoods 

Suppose that the nth agent receives the signal $Y_n = H + W_n$
where the additive noises $W_n$ are iid with pdf

$$f_W(w) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-w^2/(2\sigma^2)}. \tag{25}$$

Each likelihood $f_{Y_n|H}$ is thus Gaussian with mean $H$ and
variance $\sigma^2$. For a prior belief $q_n$, the likelihood ratio test

$$\frac{f_{Y_n|H}(y_n \mid 1)}{f_{Y_n|H}(y_n \mid 0)} \underset{\hat{H}_n(y_n)=0}{\overset{\hat{H}_n(y_n)=1}{\gtrless}} \frac{c_{10}\, q_n}{c_{01}\, (1 - q_n)} \tag{26}$$

is simplified to the following comparison:

$$y_n \underset{\hat{H}_n(y_n)=0}{\overset{\hat{H}_n(y_n)=1}{\gtrless}} \lambda(q_n) = \frac{1}{2} + \sigma^2 \log \frac{c_{10}\, q_n}{c_{01}\, (1 - q_n)}. \tag{27}$$

Fig. 3. Visualization of the Bayes risk for various $q_1$ and $q_2$ for N = 2,
$c_{10} = c_{01} = 1$, $p_0 = 0.3$, and additive Gaussian noise with zero mean and unit
variance. Alexis's and Blake's optimal prior beliefs (triangle) are different from the
true prior probability (circle). [Horizontal axis: Alexis's belief ($q_1$); plotted quantity: Blake's Bayes risk.]

Fig. 3 clearly shows that knowing the true prior probability
is not optimal. We have computed the performance of the
sequential decision making by two agents for $c_{10} = c_{01} = 1$,
$p_0 = 0.3$, and additive Gaussian noise with zero mean and unit
variance. The Bayes risk is minimum when Alexis perceives
the prior probability as 0.38 and Blake perceives it as 0.23,
which is marked with a triangle. For convenience, the true
probability is indicated by a circle.
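A coarse grid search gives a quick way to explore this setting. The sketch below is not the authors' code: it assumes the model of this section with $c_{10} = c_{01} = 1$, $p_0 = 0.3$, and unit noise variance, evaluates (22) on a 0.01-spaced grid over the belief range 0.10 to 0.50 (chosen here to mirror the axis range of Fig. 3), and uses illustrative names throughout.

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def threshold(q):
    """lambda(q) for unit costs and unit noise variance (assumed setting)."""
    return 0.5 + math.log(q / (1.0 - q))

def error_probs(lam):
    """(Type I, Type II) error probabilities of the rule 'decide 1 iff y > lam'."""
    return 1.0 - phi(lam), phi(lam - 1.0)

def risk(q1, q2, p0):
    """Blake's Bayes risk (22) with c10 = c01 = 1."""
    pe1_I, pe1_II = error_probs(threshold(q1))      # Alexis's true error rates
    b_I, b_II = error_probs(threshold(q2))          # error rates Blake attributes to her
    q2_0 = q2 * (1 - b_I) / (q2 * (1 - b_I) + (1 - q2) * b_II)
    q2_1 = q2 * b_I / (q2 * b_I + (1 - q2) * (1 - b_II))
    pe2_I0, pe2_II0 = error_probs(threshold(q2_0))
    pe2_I1, pe2_II1 = error_probs(threshold(q2_1))
    return (p0 * (pe2_I0 * (1 - pe1_I) + pe2_I1 * pe1_I)
            + (1 - p0) * (pe2_II0 * pe1_II + pe2_II1 * (1 - pe1_II)))

P0 = 0.3
grid = [i / 100 for i in range(10, 51)]             # beliefs 0.10, 0.11, ..., 0.50
best_risk, best_q1, best_q2 = min(
    (risk(q1, q2, P0), q1, q2) for q1 in grid for q2 in grid
)
risk_at_truth = risk(P0, P0, P0)
print(f"grid minimum: q1={best_q1:.2f}, q2={best_q2:.2f}, risk={best_risk:.4f}")
print(f"risk at q1=q2=p0: {risk_at_truth:.4f}")
```

The grid resolution limits how precisely the minimizer is located, so the printed beliefs should be compared with the reported values only to within the grid spacing.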

Figs. 4 and 5 depict the prior beliefs that the agents should
have for optimal decision making. They show several common
characteristics for the additive Gaussian noise model: First,
the non-terminal agents (i.e., Alexis for N = 2 and Alexis and
Blake for N = 3) should have belief larger than $p_0$ when $p_0$ is
small and belief smaller than $p_0$ when $p_0$ is large. We call this
open-mindedness because it is to assign higher prior belief to
outcomes that are very unlikely. Second, the last agent (i.e.,
Blake for N = 2 and Chuck for N = 3) should have belief
smaller than $p_0$ when $p_0$ is small and belief larger than $p_0$
when $p_0$ is large. This is necessary to compensate for the
biases of precedent agents. Last, there is a unique point, except
for $p_0 = 0$ or $p_0 = 1$, where all agents' optimal prior beliefs
are the same as the true prior probability. It occurs at $p_0 =
c_{01}/(c_{10} + c_{01})$. We prove this for N = 2 in the appendix. We
also show there for N = 2 that the first agent should be open
minded.
Fig. 4. The trend of the optimal prior beliefs for varying $p_0$ for N = 2
(Alexis and Blake). (a) $c_{10} = c_{01} = 1$. (b) $c_{10} = 1$, $c_{01} = 3$. [Horizontal axis: true prior probability ($p_0$); curves: Alexis ($q_1$), Blake ($q_2$), true $p_0$.]

Fig. 5. The trend of the optimal prior beliefs for varying $p_0$ for N = 3
(Alexis, Blake, and Chuck). (a) $c_{10} = c_{01} = 1$. (b) $c_{10} = 1$, $c_{01} = 3$. [Horizontal axis: true prior probability ($p_0$); curves: Alexis ($q_1$), Blake ($q_2$), Chuck ($q_3$), true $p_0$.]

Fig. 6. The difference between the left-hand side and the right-hand side of
(29). (a) $h_1 = 1$. (b) $h_1 = 0.5$.

VII. Conclusion

We have discussed decision making sequentially performed
by a group of agents that make decisions based on individually
biased prior beliefs. Instead of investigating herding on a
wrong action, we have assumed unbounded private signals
and focused on the agents' belief update. The Bayes-optimal
updated belief turns out to be the probability of each hypothesis
conditioned on the decisions made by previous agents.
It gets more difficult for agents in later positions to make
decisions that differ from precedents; however, if one observes
a very strong signal against the precedents and chooses the
alternative, the decision will be taken very seriously by the
following agents.

The wrong beliefs held by previous agents change the probability
that following agents choose each hypothesis. Contrary
to intuition, however, wrong beliefs are not always bad. In fact,
the optimal beliefs of agents (those that lead to the minimum
Bayes risk for the last agent) are not usually equal to the
true prior probability. Specifically, in the case of observations
corrupted by iid additive Gaussian noises, an agent biased
toward $c_{01}/(c_{10} + c_{01})$ can be more beneficial to subsequent
agents than an accurate agent is. The point $c_{01}/(c_{10} + c_{01})$
is special because the probabilities of false alarms and missed
detections will be balanced by the optimal decision rule at the
prior probability. In terms of human decision making, where
precedent agents are advisers or counselors to the last agent
who has the final decisive power, we can say that the best
advisers are necessarily open-minded people.

The idea of an open-minded adviser is related to the amount
of information conveyed in the public signals. A public signal
is a quantized version of a private signal while also simultaneously
reflecting an agent's belief. Alexis's decision will
reflect her belief more than her private signal when her belief
is very small (close to 0) or very large (close to 1). However,
Blake would want a public signal that is most informative of
Alexis's private signal. Therefore, he wants her to make her
decision with a less extreme belief or an open mind.

While some conclusions of our study depend on having
Gaussian likelihoods and may not hold for different types of
additive noise, it is more generally true that the optimal prior
beliefs are different from the true prior probability.



Appendix 

Alexis's Optimal Prior Belief for Blake 

For the case of N = 2, we investigate Alexis's prior belief 
that minimizes Blake's Bayes risk. Let us assume that 



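$$ f_{Y|H}(y \mid 0) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{y^2}{2\sigma^2}\right), \tag{28a} $$

$$ f_{Y|H}(y \mid 1) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(y - h_1)^2}{2\sigma^2}\right), \tag{28b} $$

where $\sigma$ and $h_1$ are arbitrary positive numbers. 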
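Conjecture 1. If $\lambda < h_1/2$, then 

$$ \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2} + \lambda h_1\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2} + \lambda h_1\right) dy < \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2} + y h_1\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2} + y h_1\right) dy. \tag{29} $$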



Fig. 6 depicts the difference between the left-hand side and 
the right-hand side of (29) for two values of $h_1$ and supports 
the conjecture. In the following we assume the conjecture to 
be true. 
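
The kind of numerical evidence shown in Fig. 6 can be reproduced with a short script. The sketch below is not the authors' code; it assumes the reading of (29) in which the left-hand side carries $\lambda h_1$ in the exponent and the right-hand side carries $y h_1$. Both sides then have closed forms in the standard normal CDF $\Phi$ (the right-hand side after completing the square). The function names, the grid, and the margin below $h_1/2$ are illustrative choices.

```python
import math

def Phi(x):
    """Standard normal CDF, written with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def lhs_29(lam, h1):
    # Product of the two integrals of exp(-y^2/2 + lam*h1) over (-inf, lam) and (lam, inf).
    return 2.0 * math.pi * math.exp(2.0 * lam * h1) * Phi(lam) * Phi(-lam)

def rhs_29(lam, h1):
    # Same integration ranges for exp(-y^2/2 + y*h1) = exp(h1^2/2) * exp(-(y - h1)^2/2).
    return 2.0 * math.pi * math.exp(h1 * h1) * Phi(lam - h1) * Phi(h1 - lam)

def gap_29(lam, h1):
    """Left-hand side minus right-hand side; the conjecture says this is negative."""
    return lhs_29(lam, h1) - rhs_29(lam, h1)

def holds_on_grid(h1, lam_min=-5.0, margin=1e-2, steps=400):
    """Check gap_29 < 0 on an evenly spaced grid of thresholds below h1/2 (grid is an assumption)."""
    lam_max = h1 / 2.0 - margin
    grid = [lam_min + (lam_max - lam_min) * k / steps for k in range(steps + 1)]
    return all(gap_29(lam, h1) < 0.0 for lam in grid)

if __name__ == "__main__":
    for h1 in (1.0, 0.5):  # the two values used in Fig. 6
        print(h1, holds_on_grid(h1), gap_29(0.0, h1))
```

A grid check of this kind supports the conjecture on the sampled range only; it is not a proof. At $\lambda = h_1/2$ the two sides coincide, which is why the grid stops slightly below that value.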



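Lemma 2. If $\lambda < h_1/2$, then 

$$ \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{\lambda h_1}{\sigma^2}\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{\lambda h_1}{\sigma^2}\right) dy < \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{y h_1}{\sigma^2}\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{y h_1}{\sigma^2}\right) dy. \tag{30} $$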
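Proof: Substituting $y' = y/\sigma$, $\lambda' = \lambda/\sigma$, and $h_1' = h_1/\sigma$, 
we obtain 

$$ \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{\lambda h_1}{\sigma^2}\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{\lambda h_1}{\sigma^2}\right) dy = \sigma^2 \int_{-\infty}^{\lambda'} \exp\left(-\frac{y'^2}{2} + \lambda' h_1'\right) dy' \int_{\lambda'}^{\infty} \exp\left(-\frac{y'^2}{2} + \lambda' h_1'\right) dy' $$

and 

$$ \int_{-\infty}^{\lambda} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{y h_1}{\sigma^2}\right) dy \int_{\lambda}^{\infty} \exp\left(-\frac{y^2}{2\sigma^2} + \frac{y h_1}{\sigma^2}\right) dy = \sigma^2 \int_{-\infty}^{\lambda'} \exp\left(-\frac{y'^2}{2} + y' h_1'\right) dy' \int_{\lambda'}^{\infty} \exp\left(-\frac{y'^2}{2} + y' h_1'\right) dy'. $$

Since $\lambda < h_1/2$ implies $\lambda' < h_1'/2$, (30) follows from Conjecture 1. ■ 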

Theorem 3. Alexis's and Blake's optimal prior beliefs are 
equal to the true prior probability $p_0$ if $p_0 = c_{01}/(c_{10} + c_{01})$. 

Proof: We will show that $\partial R_2/\partial q_1 = 0$ and $\partial R_2/\partial q_2 = 0$ 
for $q_1 = q_2 = p_0 = c_{01}/(c_{10} + c_{01})$. Then they are Alexis's and 
Blake's optimal prior beliefs $q_1^*$ and $q_2^*$. 
First, consider $\partial R_2/\partial q_2$ using (22): 

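$$
\begin{aligned}
\frac{\partial R_2}{\partial q_2} = {} & -c_{10} p_0 \left[ (1 - P_{e,1}^{\mathrm{I}}) f_{Y_2|H}(\lambda_2^0 \mid 0) \frac{d\lambda_2^0}{dq_2} + P_{e,1}^{\mathrm{I}} f_{Y_2|H}(\lambda_2^1 \mid 0) \frac{d\lambda_2^1}{dq_2} \right] \\
& + c_{01} (1 - p_0) \left[ P_{e,1}^{\mathrm{II}} f_{Y_2|H}(\lambda_2^0 \mid 1) \frac{d\lambda_2^0}{dq_2} + (1 - P_{e,1}^{\mathrm{II}}) f_{Y_2|H}(\lambda_2^1 \mid 1) \frac{d\lambda_2^1}{dq_2} \right].
\end{aligned}
\tag{31}
$$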



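From the belief update formulae, 

$$ \lambda_2^0 = \lambda_{1B} + \frac{\sigma^2}{h_1} \log \frac{1 - P_{e,1B}^{\mathrm{I}}}{P_{e,1B}^{\mathrm{II}}} $$

and its derivative is given by 

$$
\begin{aligned}
\frac{d\lambda_2^0}{dq_2} &= \frac{d\lambda_{1B}}{dq_2} - \frac{\sigma^2}{h_1} \left[ \frac{dP_{e,1B}^{\mathrm{I}}}{dq_2} \frac{1}{1 - P_{e,1B}^{\mathrm{I}}} + \frac{dP_{e,1B}^{\mathrm{II}}}{dq_2} \frac{1}{P_{e,1B}^{\mathrm{II}}} \right] \\
&= \left( 1 + \frac{\sigma^2}{h_1} \left[ \frac{f_{Y_1|H}(\lambda_{1B} \mid 0)}{1 - P_{e,1B}^{\mathrm{I}}} - \frac{f_{Y_1|H}(\lambda_{1B} \mid 1)}{P_{e,1B}^{\mathrm{II}}} \right] \right) \frac{d\lambda_{1B}}{dq_2}.
\end{aligned}
\tag{32}
$$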



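Likewise, 

$$ \lambda_2^1 = \lambda_{1B} + \frac{\sigma^2}{h_1} \log \frac{P_{e,1B}^{\mathrm{I}}}{1 - P_{e,1B}^{\mathrm{II}}} $$

and its derivative is 

$$ \frac{d\lambda_2^1}{dq_2} = \left( 1 - \frac{\sigma^2}{h_1} \left[ \frac{f_{Y_1|H}(\lambda_{1B} \mid 0)}{P_{e,1B}^{\mathrm{I}}} - \frac{f_{Y_1|H}(\lambda_{1B} \mid 1)}{1 - P_{e,1B}^{\mathrm{II}}} \right] \right) \frac{d\lambda_{1B}}{dq_2}. \tag{33} $$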



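In addition, $q_1 = q_2$ implies that $P_{e,1B}^{\mathrm{I}} = P_{e,1}^{\mathrm{I}}$ and 
$P_{e,1B}^{\mathrm{II}} = P_{e,1}^{\mathrm{II}}$, and we can derive the following relations 
for $q_1 = q_2 = p_0$: 

$$
\begin{aligned}
\frac{f_{Y_2|H}(\lambda_2^0 \mid 1)}{f_{Y_2|H}(\lambda_2^0 \mid 0)} &= \frac{c_{10} q_2 (1 - P_{e,1B}^{\mathrm{I}})}{c_{01} (1 - q_2) P_{e,1B}^{\mathrm{II}}} = \frac{c_{10} p_0 (1 - P_{e,1}^{\mathrm{I}})}{c_{01} (1 - p_0) P_{e,1}^{\mathrm{II}}}, \\
\frac{f_{Y_2|H}(\lambda_2^1 \mid 1)}{f_{Y_2|H}(\lambda_2^1 \mid 0)} &= \frac{c_{10} q_2 P_{e,1B}^{\mathrm{I}}}{c_{01} (1 - q_2)(1 - P_{e,1B}^{\mathrm{II}})} = \frac{c_{10} p_0 P_{e,1}^{\mathrm{I}}}{c_{01} (1 - p_0)(1 - P_{e,1}^{\mathrm{II}})}.
\end{aligned}
\tag{34}
$$

By substituting (32) and (33) in (31) and using the relations 
(34), we obtain that $\partial R_2/\partial q_2 = 0$ at $q_1 = q_2 = p_0$. 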
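
The cancellation behind this step can be written out explicitly. The following is a sketch, assuming the terms of (31) are regrouped by threshold derivative and using the notation $f_{Y_2|H}$, $P_{e,1}^{\mathrm{I}}$, and $P_{e,1}^{\mathrm{II}}$ from the proof:

$$
\begin{aligned}
\frac{\partial R_2}{\partial q_2} = {} & \left[ -c_{10} p_0 (1 - P_{e,1}^{\mathrm{I}}) f_{Y_2|H}(\lambda_2^0 \mid 0) + c_{01} (1 - p_0) P_{e,1}^{\mathrm{II}} f_{Y_2|H}(\lambda_2^0 \mid 1) \right] \frac{d\lambda_2^0}{dq_2} \\
& + \left[ -c_{10} p_0 P_{e,1}^{\mathrm{I}} f_{Y_2|H}(\lambda_2^1 \mid 0) + c_{01} (1 - p_0)(1 - P_{e,1}^{\mathrm{II}}) f_{Y_2|H}(\lambda_2^1 \mid 1) \right] \frac{d\lambda_2^1}{dq_2}.
\end{aligned}
$$

At $q_1 = q_2 = p_0$, the first relation in (34) gives $c_{01} (1 - p_0) P_{e,1}^{\mathrm{II}} f_{Y_2|H}(\lambda_2^0 \mid 1) = c_{10} p_0 (1 - P_{e,1}^{\mathrm{I}}) f_{Y_2|H}(\lambda_2^0 \mid 0)$, so the first bracket is zero. The second relation gives $c_{01} (1 - p_0)(1 - P_{e,1}^{\mathrm{II}}) f_{Y_2|H}(\lambda_2^1 \mid 1) = c_{10} p_0 P_{e,1}^{\mathrm{I}} f_{Y_2|H}(\lambda_2^1 \mid 0)$, so the second bracket is zero as well.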

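Next, we consider $\partial R_2/\partial q_1$, which is zero at $q_1$ and $q_2$ that 
satisfy (24), 

$$ \frac{q_1}{1 - q_1} = \frac{p_0 \left( P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0} \right)}{(1 - p_0) \left( P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1} \right)}. $$

The condition $q_2 = c_{01}/(c_{10} + c_{01})$ leads to $\lambda_{1B} = h_1/2$ and 
$P_{e,1B}^{\mathrm{I}} = P_{e,1B}^{\mathrm{II}}$. Hence, $\lambda_2^0 - \lambda_{1B} = \lambda_{1B} - \lambda_2^1$ and 
$\lambda_2^0 = h_1 - \lambda_2^1$. Thus, from (19) and (21), we obtain 
$P_{e,2}^{\mathrm{I}_0} = P_{e,2}^{\mathrm{II}_1}$ and $P_{e,2}^{\mathrm{I}_1} = P_{e,2}^{\mathrm{II}_0}$. 
Therefore, only $q_1 = p_0$ completes (24) and makes $\partial R_2/\partial q_1$ 
zero. ■ 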
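
The symmetry used in this last step is easy to check numerically. The sketch below assumes the Gaussian model (28), a likelihood ratio test with threshold ratio $c_{10} q / (c_{01}(1 - q))$ for a belief $q$ in $H = 0$, and updated thresholds of the form $\lambda_{1B}$ plus a scaled log-ratio of error probabilities. The function names and the parameter values in the usage lines are hypothetical.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def threshold(q, c10, c01, h1, sigma):
    """Threshold on y for belief q = P(H = 0); the agent decides 1 when y exceeds it."""
    return h1 / 2.0 + (sigma ** 2 / h1) * math.log(c10 * q / (c01 * (1.0 - q)))

def error_probs(lam, h1, sigma):
    """Return (false alarm, missed detection) probabilities for threshold lam."""
    p_false_alarm = Phi(-lam / sigma)    # P(Y > lam | H = 0)
    p_missed = Phi((lam - h1) / sigma)   # P(Y < lam | H = 1)
    return p_false_alarm, p_missed

def updated_thresholds(q2, c10, c01, h1, sigma):
    """Thresholds after observing a first decision of 0 or 1.

    Assumption: the first agent's error probabilities are evaluated at belief q2.
    """
    lam_1b = threshold(q2, c10, c01, h1, sigma)
    p1, p2 = error_probs(lam_1b, h1, sigma)
    scale = sigma ** 2 / h1
    lam_after_0 = lam_1b + scale * math.log((1.0 - p1) / p2)
    lam_after_1 = lam_1b + scale * math.log(p1 / (1.0 - p2))
    return lam_1b, lam_after_0, lam_after_1

if __name__ == "__main__":
    c10, c01, h1, sigma = 1.0, 3.0, 2.0, 1.5   # illustrative values only
    q2 = c01 / (c10 + c01)
    lam_1b, lam0, lam1 = updated_thresholds(q2, c10, c01, h1, sigma)
    print(lam_1b, error_probs(lam_1b, h1, sigma), lam0 + lam1)
```

At the special belief the threshold sits at $h_1/2$, the two error probabilities coincide, and the two updated thresholds are mirror images about $h_1/2$, which is what yields the equalities between the Type I and Type II error probabilities above.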

Theorem 4. Let $p_0 \in (0, 1)$ denote the true prior probability 
and $q_1^*$ Alexis's (i.e., the first agent's) optimal prior belief. 

• If $p_0 < c_{01}/(c_{10} + c_{01})$, then $p_0 < q_1^* < c_{01}/(c_{10} + c_{01})$. 

• If $p_0 = c_{01}/(c_{10} + c_{01})$, then $q_1^* = p_0$. 

• If $p_0 > c_{01}/(c_{10} + c_{01})$, then $c_{01}/(c_{10} + c_{01}) < q_1^* < p_0$. 
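
The direction of the bias can be probed numerically with finite differences of Blake's Bayes risk. The sketch below is an illustration, not the authors' computation. It assumes the Gaussian model (28) and, as a modeling assumption inferred from the "1B" quantities in the proof of Theorem 3, that Blake evaluates Alexis's error probabilities at his own belief $q_2$. All function names, default parameters, and the step size are hypothetical.

```python
import math

def Phi(x):
    """Standard normal CDF."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def threshold(q, c10, c01, h1, sigma):
    """Threshold on y for belief q = P(H = 0)."""
    return h1 / 2.0 + (sigma ** 2 / h1) * math.log(c10 * q / (c01 * (1.0 - q)))

def blake_risk(q1, q2, p0, c10=1.0, c01=1.0, h1=1.0, sigma=1.0):
    """Blake's Bayes risk for N = 2 when Alexis uses belief q1 and Blake uses q2."""
    # Alexis's actual error probabilities, set by her own belief q1.
    lam_1 = threshold(q1, c10, c01, h1, sigma)
    p1_fa, p1_md = Phi(-lam_1 / sigma), Phi((lam_1 - h1) / sigma)
    # Assumption: Blake models Alexis as if she held his belief q2.
    lam_1b = threshold(q2, c10, c01, h1, sigma)
    b_fa, b_md = Phi(-lam_1b / sigma), Phi((lam_1b - h1) / sigma)
    scale = sigma ** 2 / h1
    lam_0 = lam_1b + scale * math.log((1.0 - b_fa) / b_md)   # after Alexis chose 0
    lam_1x = lam_1b + scale * math.log(b_fa / (1.0 - b_md))  # after Alexis chose 1
    # Blake's error probabilities, averaged over Alexis's actual decisions.
    p2_fa = (1.0 - p1_fa) * Phi(-lam_0 / sigma) + p1_fa * Phi(-lam_1x / sigma)
    p2_md = p1_md * Phi((lam_0 - h1) / sigma) + (1.0 - p1_md) * Phi((lam_1x - h1) / sigma)
    return c10 * p0 * p2_fa + c01 * (1.0 - p0) * p2_md

def partials(q1, q2, p0, eps=1e-5, **model):
    """Central finite differences of blake_risk with respect to q1 and q2."""
    d1 = (blake_risk(q1 + eps, q2, p0, **model) - blake_risk(q1 - eps, q2, p0, **model)) / (2 * eps)
    d2 = (blake_risk(q1, q2 + eps, p0, **model) - blake_risk(q1, q2 - eps, p0, **model)) / (2 * eps)
    return d1, d2

if __name__ == "__main__":
    print(partials(0.75, 0.75, 0.75, c10=1.0, c01=3.0))  # p0 = c01 / (c10 + c01)
    print(partials(0.2, 0.2, 0.2))                       # p0 below 1/2, equal costs
```

Under these assumptions both partial derivatives vanish at $q_1 = q_2 = p_0$ when $p_0 = c_{01}/(c_{10} + c_{01})$. For a smaller $p_0$ with accurate beliefs, the derivative with respect to $q_1$ is negative, so raising Alexis's belief above $p_0$ lowers Blake's risk locally, in line with the first case of the theorem. A local derivative does not locate $q_1^*$ itself; that would need a joint search over $(q_1, q_2)$.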

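Proof: First, the proof for the case when $p_0 = c_{01}/(c_{10} + c_{01})$ 
is given in Theorem 3. 

Next, suppose that $p_0 < c_{01}/(c_{10} + c_{01})$. Let $\lambda$ in (30) denote 
a decision threshold according to $q_2^*$. Obviously, optimal prior 
beliefs $q_1^*$ and $q_2^*$ should be strictly decreasing as $p_0$ decreases, 
as in Fig. 5. Hence, Theorem 3, which states that 
$q_1^* = q_2^* = c_{01}/(c_{10} + c_{01})$ if $p_0 = c_{01}/(c_{10} + c_{01})$, implies that 

$$ q_1^* < \frac{c_{01}}{c_{10} + c_{01}} \quad \text{and} \quad q_2^* < \frac{c_{01}}{c_{10} + c_{01}} \tag{35} $$

if $p_0 < c_{01}/(c_{10} + c_{01})$. Then we get $\lambda < h_1/2$ and can use (30). 
Multiplying each integrand in (30) by the constant 
$\frac{1}{2\pi\sigma^2} \exp\left(-\frac{\lambda^2 + h_1^2}{2\sigma^2}\right)$ gives 

$$
\begin{aligned}
& \int_{-\infty}^{\lambda} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(\lambda - h_1)^2}{2\sigma^2}\right) \exp\left(-\frac{y^2}{2\sigma^2}\right) dy \int_{\lambda}^{\infty} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{(\lambda - h_1)^2}{2\sigma^2}\right) \exp\left(-\frac{y^2}{2\sigma^2}\right) dy \\
& \quad < \int_{-\infty}^{\lambda} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{\lambda^2}{2\sigma^2}\right) \exp\left(-\frac{(y - h_1)^2}{2\sigma^2}\right) dy \int_{\lambda}^{\infty} \frac{1}{2\pi\sigma^2} \exp\left(-\frac{\lambda^2}{2\sigma^2}\right) \exp\left(-\frac{(y - h_1)^2}{2\sigma^2}\right) dy.
\end{aligned}
\tag{36}
$$

According to (28), the exponential functions in (36) are 
likelihood functions of $Y_1$, so we have 

$$ f_{Y_1|H}(\lambda \mid 1)^2 \int_{-\infty}^{\lambda} f_{Y_1|H}(y \mid 0)\, dy \int_{\lambda}^{\infty} f_{Y_1|H}(y \mid 0)\, dy < f_{Y_1|H}(\lambda \mid 0)^2 \int_{-\infty}^{\lambda} f_{Y_1|H}(y \mid 1)\, dy \int_{\lambda}^{\infty} f_{Y_1|H}(y \mid 1)\, dy. \tag{37} $$

Since $\lambda = \lambda(q_2^*)$, the integrals in (37) are the error probabilities 
$P_{e,1B}^{\mathrm{I}}$ and $P_{e,1B}^{\mathrm{II}}$ and their complements. Using (37), we obtain 

$$ f_{Y_1|H}(\lambda \mid 1)^2 \, P_{e,1B}^{\mathrm{I}} (1 - P_{e,1B}^{\mathrm{I}}) < f_{Y_1|H}(\lambda \mid 0)^2 \, P_{e,1B}^{\mathrm{II}} (1 - P_{e,1B}^{\mathrm{II}}), \tag{38} $$

and (5) transforms (38) to 

$$ c_{10}^2 (q_2^*)^2 P_{e,1B}^{\mathrm{I}} (1 - P_{e,1B}^{\mathrm{I}}) < c_{01}^2 (1 - q_2^*)^2 P_{e,1B}^{\mathrm{II}} (1 - P_{e,1B}^{\mathrm{II}}). $$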
Rearranging terms gives us 

$$ \frac{c_{10} q_2^*}{c_{01} (1 - q_2^*)} \cdot \frac{1 - P_{e,1B}^{\mathrm{I}}}{P_{e,1B}^{\mathrm{II}}} < \left( \frac{c_{10} q_2^*}{c_{01} (1 - q_2^*)} \cdot \frac{P_{e,1B}^{\mathrm{I}}}{1 - P_{e,1B}^{\mathrm{II}}} \right)^{-1}. \tag{39} $$

The terms in the left-hand and the right-hand sides are the 
same as the belief update formulae (18) and (20), so this 
simplifies to 

$$ \frac{c_{10} q_2^0}{c_{01} (1 - q_2^0)} < \left( \frac{c_{10} q_2^1}{c_{01} (1 - q_2^1)} \right)^{-1}. \tag{40} $$

Let us discuss the meaning of the inequality (40). In 
Fig. 7, the convex curve depicts a flipped version of the 
receiver operating characteristic (ROC). When the prior belief 
is $q$, the error probabilities $(P_e^{\mathrm{I}}, P_e^{\mathrm{II}})$ are determined as the 
point of tangency, where the curve meets a line with slope 
$-c_{10} q / (c_{01} (1 - q))$. Two solid lines in Fig. 7 depict the lines for 
Blake's updated beliefs after observing $\widehat{H}_1 = 0$ and $\widehat{H}_1 = 1$, 
respectively denoted by $q_2^0$ and $q_2^1$. 

The inequality (40) restricts the range of error probabilities 
in which $(P_{e,2}^{\mathrm{I}_0}, P_{e,2}^{\mathrm{II}_0})$ can exist on the basis of 
$(P_{e,2}^{\mathrm{I}_1}, P_{e,2}^{\mathrm{II}_1})$; the point $B^0$ $(P_{e,2}^{\mathrm{I}_0}, P_{e,2}^{\mathrm{II}_0})$, a black dot, 
always exists on the right side of the point $\bar{B}^1$ 
$(P_{e,2}^{\mathrm{II}_1}, P_{e,2}^{\mathrm{I}_1})$, a gray diamond. 
Furthermore, the point $B^0$ cannot exist on the right side of 
the point $B^1$ $(P_{e,2}^{\mathrm{I}_1}, P_{e,2}^{\mathrm{II}_1})$, a black diamond, because obviously 
$q_2^0 > q_2^1$. Therefore, the point $B^0$ always exists on the curve 
between the points $\bar{B}^1$ and $B^1$. 

Now we draw a black dotted line that connects the points 
$B^0$ and $B^1$ and a gray dashed line that connects the points $\bar{B}^1$ 
and $B^1$. From the restriction for the point $B^0$, the slope of the 
former is always greater than that of the latter: 

$$ \frac{P_{e,2}^{\mathrm{II}_1} - P_{e,2}^{\mathrm{II}_0}}{P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0}} > -1. \tag{41} $$

We have obtained the optimality condition (24) for Alexis's 
prior belief in Section V. We can rewrite it as 

$$ q_1^* = \frac{p_0}{p_0 + (1 - p_0) \times \dfrac{P_{e,2}^{\mathrm{II}_0} - P_{e,2}^{\mathrm{II}_1}}{P_{e,2}^{\mathrm{I}_1} - P_{e,2}^{\mathrm{I}_0}}}. \tag{42} $$

By (41), the ratio in the denominator of (42) is less than 1, so the 
denominator is less than 1. Finally, we can conclude that $q_1^* > p_0$. 

In addition, Alexis's optimal belief $q_1^*$ is upper-bounded by 
$c_{01}/(c_{10} + c_{01})$ because $q_1^*$ strictly decreases as $p_0$ decreases and 
$q_1^* = c_{01}/(c_{10} + c_{01})$ when $p_0 = c_{01}/(c_{10} + c_{01})$ by Theorem 3. 
Combining these two bounds, we have the inequality 

$$ p_0 < q_1^* < \frac{c_{01}}{c_{10} + c_{01}}, \tag{43} $$

as desired. 

The statement that $c_{01}/(c_{10} + c_{01}) < q_1^* < p_0$ if 
$p_0 > c_{01}/(c_{10} + c_{01})$ can be proven similarly. ■ 

Fig. 7. The point $B^0$ $(P_{e,2}^{\mathrm{I}_0}, P_{e,2}^{\mathrm{II}_0})$ always exists between the points $\bar{B}^1$ and $B^1$. 